As you may know, Machine Learning can take many forms and perform a wide variety of tasks. However, before you build your first model, you have to understand how to train it. In this tutorial, we'll discuss the different ways of training ML models. This tutorial will cover the following learning objectives:
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Supervised Learning
Summary
In Supervised Learning, the computer learns by making use of labeled data. In this context, labels are the values you're trying to predict. For example, if you're building a model to predict house prices, the input data would include house listings with the prices included, so the computer can identify trends between the price and the other attributes, or features, provided in each listing.
Data is a collection of information used by ML models to output predictions. This may include user data, text data (such as from Product Reviews on Amazon), images, metadata (such as image attributes like size, color, and location), and JSON data produced by websites.
Features are the "X" attributes that are used to predict the "Y" output. For example, if you're trying to predict house prices, some features may include the size, number of bathrooms, number of bedrooms, and the location.
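To make the "X" and "Y" idea concrete, here is a minimal sketch of how a house-price dataset might be laid out in plain Python. The feature names and numbers are made up for illustration.

```python
# Hypothetical house-price data, sketched with plain Python lists.
# Each inner list is one sample; the columns are the features ("X"):
# [size_sqft, bathrooms, bedrooms, distance_to_city_km]
X = [
    [1400, 2, 3, 5.0],
    [2100, 3, 4, 12.5],
    [800,  1, 2, 2.0],
]

# y holds the target (label, "Y") for each sample: the sale price.
y = [250_000, 340_000, 180_000]

# Each row of X lines up with the corresponding label in y.
for features, price in zip(X, y):
    print(features, "->", price)
```

Real ML libraries use the same row-per-sample, column-per-feature layout, just with more efficient array types.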
In Supervised Learning, the target variable, or attribute, is what you're trying to predict. Whether you're trying to classify a set of observations, or predict a value on a continuous scale, the label is used to guide the computer in identifying trends between the features and the target.
A sample, or observation, is a single record, or row, of data that represents a single entity. For example, if you had a dataset of 100 house listings, one sample, or row, in the dataset would represent a single house listing.
In Supervised Learning, the Training Data contains the "target" variable with the labels included, whereas the Testing Data is a new set of observations whose labels are masked: you can see them, but the model cannot. This lets you evaluate how well your model performs on data it has never seen.
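A common way to create these two sets is to shuffle the data and hold some fraction out for testing. The sketch below does an 80/20 split in pure Python on made-up (size, price) pairs; libraries provide equivalent helpers, but the idea is the same.

```python
import random

# Hypothetical dataset: (house size in sqft, sale price) pairs.
data = [(800, 180_000), (1400, 250_000), (2100, 340_000),
        (1000, 200_000), (1750, 295_000), (1200, 230_000),
        (950, 190_000), (1600, 275_000), (2400, 360_000),
        (1100, 215_000)]

random.seed(42)                 # fixed seed so the split is reproducible
random.shuffle(data)

split = int(len(data) * 0.8)    # hold out 20% for testing
train, test = data[:split], data[split:]

# During evaluation the model sees only the features of the test set;
# the labels stay "masked" until you score the predictions.
test_features = [size for size, _ in test]
test_labels = [price for _, price in test]

print(len(train), "training samples,", len(test), "test samples")
```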
There are two types of Supervised Learning: Classification and Regression. With Classification, you're trying to predict a discrete class label based on the various features. An example of this could be predicting what breed a dog is based on its height, weight, and length. With Regression, you're trying to predict a continuous value based on the various features. An example of this is predicting the price of a used car based on its mileage, age, and title status.
Discrete values are distinct numbers or classes that represent the output. These are typically represented either by integers or text-based classes. Continuous values can take any value within a range on the number line. These are almost always represented by floating point numbers.
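The contrast between the two output types can be sketched with a single nearest-neighbour rule applied to both tasks. All of the data below is made up for illustration.

```python
# A minimal sketch contrasting classification (discrete output) with
# regression (continuous output), using one nearest-neighbour rule.

def nearest(train, x):
    """Return the training sample whose feature value is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))

# Classification: discrete labels (dog breeds, keyed by height in cm).
breeds = [(20, "dachshund"), (55, "labrador"), (75, "great dane")]
print(nearest(breeds, 60)[1])      # predicts a class label: "labrador"

# Regression: continuous values (used-car price, keyed by mileage).
prices = [(10_000, 21_500.0), (60_000, 14_200.0), (120_000, 7_800.0)]
print(nearest(prices, 55_000)[1])  # predicts a number: 14200.0
```

The algorithm is identical in both cases; only the type of the label changes, which is exactly the distinction between the two kinds of Supervised Learning.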
Unsupervised Learning
Summary
In Unsupervised Learning, the computer learns by making use of unlabeled data.
Clustering and Object Identification algorithms are common examples of Unsupervised Learning, as the computer needs to find trends between samples without any guidance. This is useful for reducing bias in training samples and gaining data-driven insights.
Clustering is the process of grouping a set of samples in such a way that similar objects fall into the same group. For example, suppose you have a dataset of 500 songs. Your goal is to assign each song to a particular genre, but you don't know which genre each song belongs to. You can use a clustering algorithm to group songs with similar characteristics to create genre groups.
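A toy version of this idea can be sketched with k-means on a single made-up feature per song (tempo in beats per minute). Real clustering would use many features at once, but one is enough to show how samples gravitate toward group centers without any labels.

```python
# A minimal 1-D k-means sketch: repeatedly assign each value to its
# nearest center, then move each center to the mean of its cluster.

def kmeans_1d(values, k, iters=10):
    centers = values[:k]                     # naive initialisation
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical song tempos: three loose "genre" groups by feel.
tempos = [68, 72, 75, 128, 132, 126, 90, 94, 88]
centers, clusters = kmeans_1d(tempos, k=3)
print(sorted(round(c) for c in centers))
```

Note that the algorithm never sees a genre label; the groups emerge purely from similarity between samples.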
Unsupervised Learning can be used to speed up outlier detection. Outliers are commonly defined as data points that lie more than two standard deviations above or below the mean. Outliers can produce bias within Supervised Learning models, so this technique is commonly used when building Supervised Learning models.
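The two-standard-deviation rule can be sketched directly with Python's standard library. The price values below are made up; one is an obvious outlier.

```python
import statistics

# Flag any point more than two standard deviations from the mean.
prices = [230, 210, 250, 245, 225, 238, 900]   # 900 is the outlier

mean = statistics.mean(prices)
std = statistics.stdev(prices)

outliers = [p for p in prices if abs(p - mean) > 2 * std]
print(outliers)   # -> [900]
```

Removing or down-weighting such points before training a Supervised Learning model is one common way to reduce the bias they introduce.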
Latent variables are variables that aren't directly measured on an individual sample but are instead inferred from other variables. An example of a latent variable could be the "health status" of an individual, inferred from time spent exercising, fruit and vegetable intake, and illness frequency.
Autoencoding is a process used by Neural Networks to reduce noise. This topic is discussed in the "Deep Learning" section of this Tutorial Topic.
Reinforcement Learning
Summary
In Reinforcement Learning, the computer learns by making mistakes and getting rewarded. Rather than receiving Training Data, the computer is given a task to complete and then goes through a series of trials and errors, receiving rewards when it succeeds, in order to learn.
In the real world, children learn most things by getting rewards for doing something correctly, and reaping natural consequences for doing something incorrectly. A similar pattern is followed when training models to complete complex tasks. The model receives some sort of positive signal when it's going in the right direction, and a negative signal when it does the opposite.
In the context of Reinforcement Learning, an Agent is the machine that is performing the task and receives feedback. Agents perform tasks based on inputs known as states.
In the context of Reinforcement Learning, the Environment is the virtual, or sometimes physical, location where the machine performs the assigned task.
Just like showing your work in Algebra class, the same principle applies to Reinforcement Learning models. If the Agent performs a task successfully but doesn't know how it accomplished it, how can it do it again? States track the situation the Agent is in at each step of an iteration. This allows the Agent to connect the steps it took to the result, so it can reproduce that result in different scenarios. This process continues until the Agent performs the task successfully, with minimal error, in every assigned scenario (such as different terrain, different times of year, or with obstacles).
Credit Assignment is the problem of working out which of the Agent's actions actually contributed to a reward, especially when the reward arrives long after the actions that earned it. Values are numerical estimates of the reward the Agent expects to receive from a given state or action. Some actions are riskier but offer a much greater reward, whereas others guarantee a successful outcome with a smaller payoff. A Policy is the strategy the Agent follows when choosing actions based on these values and its risk-reward criteria. By repeatedly updating its value estimates and its policy, the Agent converges on the most efficient way to perform the assigned task.
In Reinforcement Learning, Exploitation is when the Agent sticks with the actions it already knows yield rewards. Although this guarantees some success, it can trap the Agent in an inefficient way of reaching the goal. With Exploration, the Agent tries new actions, which may uncover better strategies and greater rewards. Balancing the two is a central challenge in Reinforcement Learning.
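The exploration/exploitation trade-off can be sketched with a simple epsilon-greedy agent on a made-up two-armed bandit: most of the time the agent exploits its best-known action, but a small fraction of the time it explores at random. All numbers here are illustrative assumptions.

```python
import random

random.seed(0)
true_reward = [0.3, 0.8]     # hypothetical win rates; arm 1 is better
values = [0.0, 0.0]          # the agent's running value estimates
counts = [0, 0]
epsilon = 0.1                # 10% of the time: explore

for _ in range(2000):
    if random.random() < epsilon:
        arm = random.randrange(2)                      # exploration
    else:
        arm = max(range(2), key=lambda a: values[a])   # exploitation
    reward = 1.0 if random.random() < true_reward[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # running mean

print(counts, [round(v, 2) for v in values])
```

Without the occasional exploration step, the Agent could lock onto the first arm that ever paid out and never discover that the other arm is better.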